Examining The Consensus Between Human Summaries: Initial Experiments With Factoid Analysis

Authors

  • Hans van Halteren
  • Simone Teufel
Abstract

We present a new approach to summary evaluation which combines two novel aspects, namely (a) content comparison between gold standard summary and system summary via factoids, a pseudo-semantic representation based on atomic information units which can be robustly marked in text, and (b) use of a gold standard consensus summary, in our case based on 50 individual summaries of one text. Even though future work on more than one source text is imperative, our experiments indicate that (1) ranking with regard to a single gold standard summary is insufficient as rankings based on any two randomly chosen summaries are very dissimilar (correlations average ρ = 0.20), (2) a stable consensus summary can only be expected if a larger number of summaries are collected (in the range of at least 30-40 summaries), and (3) similarity measurement using unigrams shows a similarly low ranking correlation when compared with factoid-based ranking.
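
Finding (1) in the abstract can be made concrete with a small sketch. It is not the authors' implementation. It assumes that each summary has already been annotated as a set of factoid IDs, that a summary is scored by the fraction of the gold standard's factoids it contains, and that ρ denotes Spearman's rank correlation; the abstract states none of these details. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def factoid_score(candidate, gold):
    """Fraction of the gold summary's factoids found in the candidate (assumed measure)."""
    return len(candidate & gold) / len(gold) if gold else 0.0


def average_ranks(values):
    """Rank values (1 = smallest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # rho is undefined for constant scores; 0.0 is a convention of this sketch
    return cov / (var_x * var_y) ** 0.5


def mean_pairwise_rho(summaries):
    """Average rho over all pairs of summaries, each used as the single gold standard."""
    rhos = []
    for g1, g2 in combinations(range(len(summaries)), 2):
        # Rank the remaining summaries once against each of the two gold standards.
        others = [s for k, s in enumerate(summaries) if k not in (g1, g2)]
        scores1 = [factoid_score(s, summaries[g1]) for s in others]
        scores2 = [factoid_score(s, summaries[g2]) for s in others]
        rhos.append(spearman_rho(scores1, scores2))
    return mean(rhos)


# Hypothetical toy data: five summaries, each a set of factoid IDs.
toy = [{1, 2, 3, 4}, {1, 2, 5}, {2, 3, 6}, {1, 4, 5, 6}, {3, 4}]
print(mean_pairwise_rho(toy))
```

A low average from `mean_pairwise_rho` would mean that the ranking of summaries depends strongly on which single summary is chosen as the gold standard, which is the situation the abstract reports.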

Similar resources

Agreement in Human Factoid Annotation for Summarization Evaluation

Factoid analysis was introduced by van Halteren and Teufel (2003) as an objective, yet semantics-oriented way of measuring overlap of information rather than surface strings in summaries. In this paper, we report on annotation experiments with two sets of summaries, and on a factoid-pairing program which finds correlations between factoids semi-automatically.

Evaluating Information Content by Factoid Analysis: Human annotation and stability

We present a new approach to intrinsic summary evaluation, based on initial experiments in van Halteren and Teufel (2003), which combines two novel aspects: comparison of information content (rather than string similarity) in gold standard and system summary, measured in shared atomic information units which we call factoids, and comparison to more than one gold standard summary (in our data: 2...

Presenting an Intelligent, Semantics-Oriented System for Evaluating Text Summarization Systems

Summarizers and machine translators have recently attracted much attention, and considerable work on building such tools has been done around the world. For Farsi, as for other languages, there have been efforts in this field, so evaluating such tools is of great importance. Human evaluations of machine summarization are extensive but expensive. Human evaluations can take months to f...

A Comparative Study of Alkaline Hydrolysis of Ethyl Acetate Using Design of Experiments

Alkaline hydrolysis of ethyl acetate is essentially an irreversible, second-order reaction. The industrial importance of the reaction product, sodium acetate, necessitates process improvement in terms of maximum conversion and economical usage of raw materials. Statistical design of experiments was utilized to enhance conversion in both batch and plug flow reactors. A full two level factor...

An Analysis of User Strategies for Examining and Processing Ranked Lists of Documents

The predominant display of document retrieval results is a ranked list of query-biased summaries. When examining and processing search results, users must make complex decisions about how to allocate their time and relevance judging effort between evaluation of summaries and the full documents reachable with a mouse click on a summary. We performed a cluster analysis of the search results proce...

Publication date: 2003